Reliable Optimization of Arbitrary Functions over Quantum Measurements

As the connection between classical and quantum worlds, quantum measurements play a unique role in the era of quantum information processing. Given an arbitrary function of quantum measurements, how to obtain its optimal value is often considered as a basic yet important problem in various applications. Typical examples include but are not limited to optimizing the likelihood functions in quantum measurement tomography, searching the Bell parameters in Bell-test experiments, and calculating the capacities of quantum channels. In this work, we propose reliable algorithms for optimizing arbitrary functions over the space of quantum measurements by combining the so-called Gilbert’s algorithm for convex optimization with certain gradient algorithms. With extensive applications, we demonstrate the efficacy of our algorithms with both convex and nonconvex functions.


Introduction
In quantum information science, numerous complex mathematical problems remain to be solved. Since the set of quantum states as well as quantum measurements form convex sets, various important tasks in this field, such as the calculation of ground state energy, violation of the Bell inequality, and the detection and quantification of quantum entanglement [1,2], conform to the framework of convex optimization theory. The primary tool in convex optimization is semidefinite programming (SDP) [3,4], which can be used to derive relaxed constraints and provide accurate solutions for a large number of computationally challenging tasks. However, serious drawbacks also exist for SDP including its slow computation speed and low accuracy. For instance, SDP can only compute up to four qubits in quantum state tomography (QST), while improved superfast algorithms [5] can quickly go up to eleven qubits with a higher precision. Consequently, developing more efficient algorithms in convex optimization is becoming more and more crucial as quantum technologies rapidly advance.
Recently, Brierley et al. proposed an efficient convex optimization algorithm [6] based on the so-called Gilbert's algorithm [7]. Concurrently, Ref. [8] used Gilbert's algorithm to investigate whether nonlocal correlations can be distinguished in polynomial time. In Ref. [9], Gilbert's algorithm was employed as a tool to satisfy certain constraints, based on which two reliable convex optimization schemes over the quantum state space were proposed. In addition, some nonconvex optimization algorithms have been put forward for QST; for instance, the one in Ref. [10] is faster and more accurate than previous approaches. Note, however, that all these studies concern optimization over the quantum state space only; optimization over the quantum measurement space is rarely considered.
In fact, various important and meaningful problems related to quantum measurements exist in convex optimization, including, for example, searching the Bell parameters in Bell-test experiments [11], optimizing the correlation of quantum measurements under different measurement settings [12–15], and maximizing the likelihood functions in quantum measurement tomography. Meanwhile, the characterization of quantum measurements forms the basis for quantum state tomography [16–18] and quantum process tomography [19–21]. Therefore, convex optimization over the quantum measurement space stands as an independent yet important problem in quantum information theory. However, the space of quantum measurements is much more complex than the quantum state space, since it is possible to produce an infinite variety of different measurement outcomes as long as the probabilities for these outcomes sum to one. Recently, Ref. [22] proposed a method to optimize over the measurement space based on SDP, but it fails to solve complex tasks due to the intrinsic limitations of SDP. Worse still, nonconvex functions [23] easily appear in the space of quantum measurements. Unlike in the convex case, the optimization of a nonconvex function may end up in a local optimum; hence, nonconvex optimization is regarded as more difficult than convex optimization. In this work, we propose two reliable algorithms for optimizing arbitrary functions over the space of quantum measurements by combining the so-called Gilbert's algorithm for convex optimization with the direct-gradient (DG) algorithm as well as the accelerated projected-gradient (APG) algorithm. With extensive applications, we demonstrate the efficacy of our algorithms with both convex and nonconvex functions.
This work is organized as follows. In Section 2, we propose two reliable algorithms for optimizing over the quantum measurement space by combining Gilbert's algorithm with the DG and APG algorithms, respectively. In Section 3, the universality of our method is demonstrated by several examples with both convex and nonconvex functions. Section 4 provides the conclusions.

Function Optimization
In the quantum state space $Q$, an arbitrary state $\rho$ should satisfy the conditions

$$\rho \ge 0, \qquad \mathrm{tr}(\rho) = 1.$$

Given a smaller convex subset $C \subset Q$, Gilbert's algorithm can be used to approximately find the state $\rho_C \in C$ closest to $\rho$ [9]. More generally, for an arbitrary matrix $M$ in the matrix space $\mathcal{M}$, we employ Gilbert's algorithm to search for the quantum state $\rho_Q \in Q$ closest to $M$. Throughout this work, we denote the operation of Gilbert's algorithm by

$$\rho_Q = S(M).$$

Given experimental data, it is critical to identify the measurement settings that are most compatible with the data. Here, we consider the quantum measurement space $\Omega$ consisting of all the positive operator-valued measures (POVMs). A quantum measurement device is characterized by a set of operators $\{\Pi_l\}$, which have to satisfy two constraints,

$$\Pi_l \ge 0 \quad \text{for all } l, \tag{4}$$

$$\sum_{l=1}^{L} \Pi_l = I, \tag{5}$$

where $L$ is the total number of operators in the set. Consider a function $F(\{\Pi_l\})$ defined over the quantum measurement space $\Omega$. We assume that $F(\{\Pi_l\})$ is differentiable with the gradient $\nabla F(\{\Pi_l\}) \equiv G(\{\Pi_l\})$. The objective is to optimize $F(\{\Pi_l\})$ over the entire quantum measurement space, i.e.,

$$\text{optimize} \quad F(\{\Pi_l\}), \tag{6a}$$

$$\text{subject to} \quad \{\Pi_l\} \in \Omega. \tag{6b}$$

A simple gradient method is very likely to take $\{\Pi_l\}$ outside of the quantum measurement space; to prevent this, we employ Gilbert's algorithm to guarantee the condition in Equation (4). In addition, we rewrite the POVM as $\{\Pi_l\} = \{\Pi_1, \Pi_2, \ldots, \Pi_{L-1}, I - \sum_{l=1}^{L-1} \Pi_l\}$ to satisfy the condition in Equation (5). The optimization then proceeds as follows.
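The projection $S(M)$ can be illustrated with a minimal sketch of a Gilbert-type iteration, assuming the Hilbert-Schmidt distance and the full state space as the target set. The function name `gilbert_project`, the starting point, and the stopping tolerances are illustrative choices and are not taken from the original algorithm listing.

```python
import numpy as np

def gilbert_project(M, max_iter=2000, tol=1e-12):
    """Approximate the density matrix closest to the Hermitian matrix M."""
    d = M.shape[0]
    rho = np.eye(d, dtype=complex) / d  # start from the maximally mixed state
    for _ in range(max_iter):
        residual = M - rho
        # Among all states, the projector onto the top eigenvector of the
        # residual has the largest overlap with the residual.
        _, vecs = np.linalg.eigh(residual)
        v = vecs[:, -1]
        sigma = np.outer(v, v.conj())
        direction = sigma - rho
        gain = np.trace(residual @ direction).real
        norm2 = np.trace(direction @ direction).real
        if gain <= tol or norm2 <= tol:
            break  # no state brings rho closer to M
        step = min(1.0, gain / norm2)  # exact line search on the segment [rho, sigma]
        rho = rho + step * direction
    return rho

# A nonphysical "state" with a negative eigenvalue; its closest state is |0><0|.
M = np.array([[1.5, 0.0], [0.0, -0.5]], dtype=complex)
rho_Q = gilbert_project(M)
print(np.round(rho_Q.real, 6))
```

Every iterate is a convex combination of density matrices, so the output is physical by construction, which is the property exploited when the gradient step leaves the measurement space.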
Taking a to-be-minimized objective function as an example, the $(k+1)$th iteration proceeds as follows. First, update the first $(L-1)$ measurement operators with the DG scheme to obtain

$$\Pi_{l,k+1} = \Pi_{l,k} - \epsilon\, G(\Pi_{l,k}).$$

Here, $\epsilon$ represents the step size of the update, which can be any positive value, and $k$ is the iteration index. Second, normalize the measurement operators $\Pi_{l,k+1}$ as density matrices $\rho_{l,k+1}$, such that

$$\rho_{l,k+1} = \frac{\Pi_{l,k+1}}{\mathrm{tr}(\Pi_{l,k+1})},$$

which could be nonphysical. Third, use Gilbert's algorithm to project $\rho_{l,k+1}$ back to the quantum state space $Q$, i.e., $\rho_{l,k+1} \to \rho^{Q}_{l,k+1} = S(\rho_{l,k+1})$. Finally, reconstruct the physical measurement operators as

$$\Pi^{\Omega}_{l,k+1} = t_{l,k+1}\, \rho^{Q}_{l,k+1},$$

where the parameters $t_{l,k+1}$ are determined with the projected states $\rho^{Q}_{l,k+1}$ held fixed. Here, to ensure that the first $(L-1)$ measurement operators satisfy the condition in Equation (4), only $t_{l,k+1} \ge 0$ is required, since $\rho^{Q}_{l,k+1} \ge 0$ is guaranteed by Gilbert's algorithm. Meanwhile, in order to ensure that the last element of the new POVM satisfies the condition in Equation (4), let

$$I - \sum_{l=1}^{L-1} t_{l,k+1}\, \rho^{Q}_{l,k+1} \ge 0. \tag{11}$$

Hence, we obtain a new POVM $\{\Pi^{\Omega}_{l,k+1}\}$ that satisfies the condition in Equation (6b) after each iteration. Whenever the difference between the objective values of adjacent iterations is less than a certain threshold, the iteration stops, and the optimal POVM is obtained. Otherwise, the iteration continues, with the step size controlled by a step factor $\beta$. When $F_k < F_{k-1}$, the step size is appropriately selected. When $F_k > F_{k-1}$, the step size is too large, and the step factor $\beta$ is used to adjust it. See the DG algorithm in Algorithm 1.
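The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions and not a transcription of Algorithm 1: the choice of $t_l$ as the trace of the updated operator, the common rescaling that keeps the last element positive, the toy quadratic objective, and all function names (`gilbert_project`, `rebuild_povm`, `dg_minimize`) are hypothetical.

```python
import numpy as np

def gilbert_project(M, max_iter=500, tol=1e-12):
    """Gilbert-type search for the density matrix closest to Hermitian M."""
    rho = np.eye(M.shape[0], dtype=complex) / M.shape[0]
    for _ in range(max_iter):
        residual = M - rho
        v = np.linalg.eigh(residual)[1][:, -1]
        direction = np.outer(v, v.conj()) - rho
        gain = np.trace(residual @ direction).real
        norm2 = np.trace(direction @ direction).real
        if gain <= tol or norm2 <= tol:
            break
        rho = rho + min(1.0, gain / norm2) * direction
    return rho

def rebuild_povm(head):
    """Turn L-1 Hermitian matrices into a physical L-element POVM."""
    d = head[0].shape[0]
    # Assumption: t_l is the trace of the updated operator, floored at a tiny positive value.
    ts = [max(np.trace(P).real, 1e-12) for P in head]
    ops = [t * gilbert_project(P / t) for P, t in zip(head, ts)]
    # Assumption: a common rescaling keeps the last element I - sum(ops) positive.
    top = np.linalg.eigvalsh(sum(ops))[-1]
    if top > 1.0:
        ops = [P / top for P in ops]
    return ops + [np.eye(d) - sum(ops)]

def dg_minimize(F, grad, povm, eps=0.1, beta=0.5, tol=1e-10, max_iter=500):
    value = F(povm)
    for _ in range(max_iter):
        head = [P - eps * g for P, g in zip(povm[:-1], grad(povm))]
        trial = rebuild_povm(head)
        trial_value = F(trial)
        if trial_value > value:  # step too large: shrink it with beta and retry
            eps *= beta
            if eps < 1e-12:
                break
            continue
        converged = value - trial_value < tol
        povm, value = trial, trial_value
        if converged:
            break
    return povm, value

# Toy objective: squared distance to the sigma_x projectors.
plus = np.array([[0.5, 0.5], [0.5, 0.5]])
target = [plus, np.eye(2) - plus]

def F(povm):
    return sum(np.linalg.norm(P - T) ** 2 for P, T in zip(povm, target))

def grad(povm):
    # Derivative with respect to the first L-1 elements, with the last one tied to them.
    last = povm[-1] - target[-1]
    return [2 * (P - T) - 2 * last for P, T in zip(povm[:-1], target[:-1])]

start = [np.eye(2) / 2, np.eye(2) / 2]
povm_opt, f_opt = dg_minimize(F, grad, start)
print(f_opt)
```

The positivity of every element is enforced by construction rather than by a penalty term, so the objective `F` itself never needs to know about the constraints.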
However, the DG algorithm has some disadvantages, such as slow convergence and low accuracy. For faster convergence, one can choose the APG algorithm [5,24]. The APG algorithm adjusts the direction of the gradient at each step, which improves the convergence speed. In simple terms, the APG algorithm introduces a companion operator $E_{l,k}$, given by $\Pi_{l,k}$ plus a momentum term from the previous step whose weight is controlled by the parameter $\theta$. The measurement operators are then updated from the companion operator with step size $\epsilon$, i.e., $\Pi_{l,k} = E_{l,k-1} - \epsilon\, G(E_{l,k-1})$. The specific process is shown in Algorithm 2.
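The role of the companion point and of $\theta$ can be seen in a generic accelerated projected-gradient loop. The sketch below assumes the standard FISTA-type schedule $\theta_{k+1} = (1 + \sqrt{1 + 4\theta_k^2})/2$ with momentum weight $(\theta_k - 1)/\theta_{k+1}$, which may differ from the schedule in Algorithm 2. For brevity, a projection onto the unit ball stands in for the Gilbert-based reconstruction of the POVM, and the names `apg`, `project`, and `companion` are illustrative.

```python
import numpy as np

def apg(grad, project, x0, eps, n_iter=100):
    """Accelerated projected gradient with a FISTA-type momentum schedule."""
    x_prev = project(x0)
    companion = x_prev.copy()  # plays the role of the companion operator E_k
    theta = 1.0
    for _ in range(n_iter):
        # Gradient step taken from the companion point, then projected back.
        x = project(companion - eps * grad(companion))
        theta_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * theta ** 2))
        # Companion point = new iterate plus momentum from the previous step.
        companion = x + ((theta - 1.0) / theta_next) * (x - x_prev)
        x_prev, theta = x, theta_next
    return x_prev

# Toy problem: the point of the unit ball closest to c = (3, 4).
c = np.array([3.0, 4.0])
grad = lambda x: x - c                                 # gradient of 0.5 * |x - c|^2
project = lambda x: x / max(1.0, np.linalg.norm(x))    # projection onto the unit ball
x_opt = apg(grad, project, np.zeros(2), eps=0.1)
print(x_opt)
```

In the measurement setting, `project` would be replaced by the normalize, project, and rescale steps of the DG scheme, while the momentum bookkeeping stays the same.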

Applications
In this section, we demonstrate the efficacy of our algorithms by optimizing arbitrary convex as well as nonconvex functions over the space of quantum measurements.

Convex Functions
In quantum measurement tomography [25–27], a set of known probe states $\rho_m$ is measured to provide the information needed to reconstruct an unknown POVM $\{\Pi_l\}$. The probability that the device responds to the quantum state $\rho_m$ by producing the outcome $\Pi_l$ is given by

$$p_{lm} = \mathrm{tr}(\rho_m \Pi_l).$$

Typically, the linear inversion method [28] can be used to obtain the ideal POVM, but nonphysical results are likely to be obtained. Maximum likelihood estimation (MLE) [29] was then proposed to reconstruct a POVM that satisfies all the conditions. However, MLE fails to return any meaningful results when the target POVM is of low rank, which is quite typical, especially in higher-dimensional spaces. These problems can be avoided by using our algorithms.
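To estimate the operators $\{\Pi_l\}$, we maximize the likelihood function

$$\mathcal{L}(\{\Pi_l\}) = \prod_{l=1}^{L} \prod_{m=1}^{M} p_{lm}^{f_{lm}},$$

where $M$ is the number of different input states $\rho_m$, and

$$f_{lm} = \frac{n_{lm}}{n},$$

with $n_{lm}$ denoting the number of occurrences of the $l$th outcome when measuring the $m$th state $\rho_m$, and $n$ representing the total number of measured input states. One can see that $\mathcal{L}(\{\Pi_l\})$ is not strictly concave, while the log-likelihood $\ln \mathcal{L}(\{\Pi_l\})$ is. Here, we minimize the negative log-likelihood function $F(\{\Pi_l\}) = -\ln \mathcal{L}(\{\Pi_l\})$ with

$$\ln \mathcal{L}(\{\Pi_l\}) = \sum_{l=1}^{L} \sum_{m=1}^{M} f_{lm} \ln p_{lm}.$$

To satisfy the condition in Equation (5), rewrite the objective function as

$$F(\{\Pi_l\}) = -\sum_{l=1}^{L-1} \sum_{m=1}^{M} f_{lm} \ln \mathrm{tr}(\rho_m \Pi_l) - \sum_{m=1}^{M} f_{Lm} \ln \mathrm{tr}\Big[\rho_m \Big(I - \sum_{l=1}^{L-1} \Pi_l\Big)\Big].$$

The gradient of $\ln \mathcal{L}(\{\Pi_l\})$ with respect to $\Pi_l$, for $l = 1, \ldots, L-1$, is

$$\frac{\partial \ln \mathcal{L}}{\partial \Pi_l} = \sum_{m=1}^{M} \left( \frac{f_{lm}}{p_{lm}} - \frac{f_{Lm}}{p_{Lm}} \right) \rho_m.$$

For numerical simulations, we mainly consider Pauli measurements, which are the most commonly used measurements in quantum information processing. The cases of one qubit, one qutrit, two qubits, and two qutrits are used for the experimental setup. Specifically, the setups of these four scenarios are described below.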
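As an illustration of the objective and its gradient, the sketch below evaluates the negative log-likelihood and its derivative with respect to the first $L-1$ elements, with the last element eliminated through the completeness condition. The six Pauli eigenstates as probes, the noiseless frequencies, the trial operator, and the function names are all assumptions made for the example.

```python
import numpy as np

def probabilities(povm, probes):
    """p[l, m] = tr(rho_m Pi_l)."""
    return np.array([[np.trace(rho @ P).real for rho in probes] for P in povm])

def neg_log_likelihood(head, probes, freqs):
    """F = -sum_lm f_lm ln p_lm, with Pi_L = I - sum of the first L-1 elements."""
    d = probes[0].shape[0]
    povm = list(head) + [np.eye(d) - sum(head)]
    return -np.sum(freqs * np.log(probabilities(povm, probes)))

def gradient(head, probes, freqs):
    """Gradient of F with respect to each of the first L-1 elements."""
    d = probes[0].shape[0]
    povm = list(head) + [np.eye(d) - sum(head)]
    ratio = freqs / probabilities(povm, probes)
    return [-sum((ratio[l, m] - ratio[-1, m]) * rho for m, rho in enumerate(probes))
            for l in range(len(head))]

def ket_state(a, b):
    """Projector onto the normalized qubit ket (a, b)."""
    v = np.array([a, b], dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

# Assumed probes: the six Pauli eigenstates.
probes = [ket_state(1, 0), ket_state(0, 1), ket_state(1, 1),
          ket_state(1, -1), ket_state(1, 1j), ket_state(1, -1j)]
true_povm = [ket_state(1, 1), ket_state(1, -1)]             # projectors along x
freqs = probabilities(true_povm, probes) / len(probes)      # noiseless frequencies

head = [np.array([[0.7, 0.1], [0.1, 0.3]], dtype=complex)]  # full-rank trial element
F0 = neg_log_likelihood(head, probes, freqs)
G0 = gradient(head, probes, freqs)
print(F0)
```

The gradient is evaluated at a full-rank trial POVM so that all probabilities are strictly positive; at rank-deficient points some $p_{lm}$ vanish and the logarithm has to be handled with care.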

One Qubit
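For one qubit, we take the eigenstates of $\sigma_z$ and the superposition states $\frac{1}{\sqrt{2}}(|0_z\rangle \pm i|1_z\rangle)$ as the input states. In the measurement setup, we select the projection of the spin along the x-axis, i.e., the POVM elements are the projectors onto the eigenstates of $\sigma_x$.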

Two Qutrits
Finally, for the case of two qutrits, we perform a numerical simulation of the Stern-Gerlach apparatus measuring two particles with spin-1. We assume 45 different input states, 36 of which are superposition states. In the simulation, the device measures the projection of the spin along the x-axis, and the POVM elements are projectors. For the four cases of simulation, the number of measurements for each probe state is 300, $10^5$, $10^5$, and $5 \times 10^5$, respectively. Then, according to the frequencies obtained from the simulated data, we use our algorithms to reconstruct the POVM. The fidelity between two POVM elements is defined as the fidelity between the two corresponding states $\sigma$ and $\rho$. In addition, the overall fidelity between two POVMs $\{\Pi_l\}_{l=1}^{L}$ and $\{\Pi_j\}_{j=1}^{L}$ on a $d$-dimensional Hilbert space is defined with the weights $w_l = \sqrt{\mathrm{tr}(\Pi_l)\,\mathrm{tr}(\Pi_j)}/d$ [30]. The overall fidelities of the reconstructed POVMs are shown in Figure 1. Figures 2 and 3 present the fidelities of the POVM elements reconstructed using the DG and APG algorithms as functions of the number of iteration steps in the different cases. We can see that the two algorithms are almost identical in accuracy, and the fidelities of the measurement operators are close to 1. Generally speaking, the APG algorithm converges faster than the DG algorithm. In addition, one notices that the fidelity of the last element in some of the simulations is not always increasing, which is a result of the constraint that we set in Equation (11).

Figure 1. Overall fidelities of the measurements reconstructed by the two algorithms for the different cases. The number of measurements used in each simulation for each probe state is 300, $10^5$, $10^5$, and $5 \times 10^5$, respectively. For most cases, the APG algorithm converges faster than the DG algorithm.

Nonconvex Functions
Quantum detector self-characterization (QDSC) tomography is another method for characterizing quantum measurements. Unlike quantum measurement tomography, this method does not require knowing the specific form of the input probe states, but directly optimizes a cost function based on the measurement statistics $f_m$ to reconstruct the measurements. For a POVM with $L$ outcomes probed by a set of states labeled by $m$, a data set of the measurement statistics $f_{lm}$ is obtained. We write the distribution of the data for each state as a vector $f_m = (f_{1m}, f_{2m}, \ldots, f_{Lm})^T$. For the one-qubit case, define $N_{i,l} = b_i^T b_l$ and write the POVM in the Bloch representation, Equation (25), where $i$ and $l$ label the rows and columns of the matrix $N$, $a = (a_1 \cdots a_L)^T$, $b_l = (b_{l,x}, b_{l,y}, b_{l,z})$, $\sigma = (\sigma_x, \sigma_y, \sigma_z)$, and $1 \le i, l \le L$. The matrix $N$ and the vector $a$ are represented as in Equations (26) and (27). Then, the optimization of the cost function $F(N^+, a)$ is given by the constrained minimization in Equations (28a) and (28b) [23], where $N^+$ stands for the Moore-Penrose pseudoinverse of $N$. One notices that the objective function is nonconvex. Optimization of nonconvex functions is difficult, as local minima might be found. Interestingly, we find that our algorithm can also be used to optimize nonconvex functions. Since our algorithm guarantees the conditions for quantum measurements, one only needs to optimize the objective function regardless of the constraint in Equation (28b). For numerical simulations, we choose 50 probe states, parametrized with $i = 1, 2, \ldots, 6$ and $n = 1, 2, \ldots, 8$. In addition, we use the two-dimensional SIC POVM as the measurement device, and each state is measured 200 times. The APG algorithm is used to optimize the objective function. First, select any set of POVM operators in the measurement space, and use Equations (26) and (27) to obtain the initial values $N^+_k$ and $a_k$, respectively. Similarly, we calculate the gradient of the objective function in Equation (28a).
With the gradient of the objective function at hand, the values of $N_{k+1}$ and $a_{k+1}$ are obtained by iterating over $N_k$ and $a_k$ using gradient descent; then, $b_{l,k+1}$ is obtained by decomposing $N_{k+1}$. In the experiment, we fix the reference frame by taking the vector $b_1$ parallel to the z-direction of the Bloch sphere and by setting the xz-plane of the Bloch sphere as the plane determined by the vectors $b_1$ and $b_2$. This is equivalent to $b_{1,x} = b_{1,y} = b_{2,y} = 0$. Then, $\{\Pi_{l,k+1}\}_{l=1}^{L-1}$ can be obtained by using Equation (25), which is the update for $\{\Pi_{l,k}\}_{l=1}^{L-1}$. The fidelity of each POVM element can approach 1 in a very small number of iteration steps; see Figure 4. Then, the fidelities of the measurements are compared with the ones reported in [23], demonstrating that the performance of our algorithm is slightly better; see Figure 5.
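For orientation, the measurement device used in this example can be written down explicitly. The sketch below builds a two-dimensional SIC POVM in the reference frame described above ($b_1$ along z, $b_2$ in the xz-plane) and simulates 200 shots on one probe state. It assumes the common parametrization $\Pi_l = a_l (I + b_l \cdot \sigma)$ with $a_l = 1/4$, which may differ from the form used in Equation (25); the probe state and the helper names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# Tetrahedral Bloch vectors with b_1 along z and b_2 in the xz-plane.
s2 = np.sqrt(2.0)
bloch = np.array([[0.0, 0.0, 1.0],
                  [2 * s2 / 3, 0.0, -1 / 3],
                  [-s2 / 3, np.sqrt(2 / 3), -1 / 3],
                  [-s2 / 3, -np.sqrt(2 / 3), -1 / 3]])

def bloch_to_operator(weight, b):
    """Return weight * (I + b . sigma)."""
    return weight * (I2 + sum(bk * sk for bk, sk in zip(b, sigma)))

sic = [bloch_to_operator(0.25, b) for b in bloch]

def simulate_frequencies(povm, rho, shots, rng):
    """Sample outcome frequencies f_lm for one probe state."""
    p = np.array([np.trace(rho @ P).real for P in povm])
    p = np.clip(p, 0.0, None)  # guard against tiny negative rounding errors
    counts = rng.multinomial(shots, p / p.sum())
    return counts / shots

rng = np.random.default_rng(1)
probe = bloch_to_operator(0.5, [0.0, 0.0, 1.0])  # hypothetical probe: spin up along z
f_m = simulate_frequencies(sic, probe, 200, rng)
print(f_m)
```

In QDSC the probe states themselves are not used in the reconstruction; only frequency vectors such as `f_m` enter the cost function.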

Conclusions
We have proposed two reliable algorithms for optimizing arbitrary functions over the quantum measurement space. For a demonstration, we have shown several examples on the convex function of quantum measurement tomography with different dimensions as well as a nonconvex function of one qubit in quantum detector self-characterization tomography. Surprisingly, our method does not encounter the problem of rank deficiency. Compared with SDP, our method can be easily applied to higher-dimensional cases as well as to the optimization of nonconvex functions. Moreover, in the self-characterization example, our method yields slightly higher fidelities than those reported in [23]. For future work, we will consider the optimization over the joint space of quantum states and quantum measurements, for tasks such as calculating the capacity of quantum channels.

Institutional Review Board Statement: Not applicable.

Data Availability Statement:
The data that support the findings of this study are available from the corresponding author upon reasonable request.